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Abstract 



A method is given for solving an optimal H2 approximation problem for SISO linear time-invariant stable systems. The 
method, based on constructive algebra, guarantees that the global optimum is found; it does not involve any gradient-based search, 
and hence avoids the usual problems of local minima. We examine mostly the case when the model order is reduced by one, and 
when the original system has distinct poles. This case exhibits special structure which allows us to provide a complete solution. 
The problem is converted into linear algebra by exhibiting a finite-dimensional basis for a certain space, and can then be solved 
by eigenvalue calculations, following the methods developed by Stetter and Moller [29], [34]. The use of Buchberger's algorithm 
is avoided by writing the first-order optimality conditions in a special form, from which a Grobner basis is immediately available. 
Compared with our previous work [18], the method presented here has much smaller time and memory requirements, and can 
therefore be applied to systems of significantly higher McMillan degree. In addition, some hypotheses which were required in the 
previous work have been removed. Some examples are included. 

I. Introduction 

In this paper we consider the problem of approximating a stable linear dynamic system by one of lower McMillan degree. 
We take the L2 norm as the measure of approximation, namely we solve the problem 



where h G Ai(N) is the impulse response of the original system, h is the impulse response of the approximating system, and 
M. (N) denotes the set of impulse responses of minimal stable systems of McMillan degree N. This problem is equivalent to 
the problem of finding an approximation which minimizes the H2 norm of the error in the frequency response: 



where H and H are the frequency responses of the original and the approximating systems, respectively, and Tt(N) denotes 
the set of Fourier transforms of elements of A4(N). Throughout this paper we consider SISO systems only, and we solve the 
H2 problem for n = N — 1. We assume mostly that the 'true' system has distinct poles. From section II onwards we will work 
with the set ESV of rational transfer functions, whose impulse responses are elements of A4 (N) and frequency responses are 
elements of TL(N), and we will look for approximants in the set 

The H2 problem has many applications and connections to other problems in systems and control theory, including model 
simplification, system identification, and approximate model matching. Many publications treat this problem, such as [2], [28] 
and the references cited therein. An early publication on this problem, possibly the oldest, is [1]. We investigate the H2 
approximation problem by means of constructive algebra, in particular by exploiting the theory of polynomial ideals. There is 
an increasing use of computer algebra in systems theory, see e.g. [14], [25], [31], [33], [37], [38]. This paper makes a further 
contribution to this trend. 

We believe that the significance of this paper lies in its introduction of a promising new approach to model reduction 
problems. We emphasise that this approach does not involve gradient-based search methods, and hence avoids the usual 
problems associated with local minima. Our use of constructive algebra leads to an algorithm with the important attribute that 
the solution found is guaranteed to be the global optimum. In [18] two of the present authors already applied constructive 
algebra to the H2 approximation problem, taking an approach based on state-space realizations of the linear systems involved. 
By contrast, the approach here is based on a form of the first-order necessary conditions for optimality which arises from 
transfer function descriptions of both the original and the approximating systems. The solution method which we develop here 
is quite different from that developed in [18]. Computationally it is much more efficient, as regards both memory and time 
requirements. This allows us to tackle problems with significantly larger values of N, as can be seen from the examples. 
Furthermore, [18] required some technical hypotheses relating to the finiteness of the number of critical points, which are not 
needed in this paper. 
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In addition to finding the global optimum, our approach gives important new theoretical insight into the structure of the 
reduction-by-one problem. In particular, we show that the number of critical points is finite, and in fact no greater than 2 N — 1. 
The computational complexity is high, and the method involves some delicate numerical steps, so we do not claim that our 
approach is a rival, at this stage, for conventional numerical approaches in routine applications to engineering problems. But 
even now it has some practical uses, for example as a generator of 'benchmark' solutions against which other methods can 
be tested. Since, as will be seen, it relies on eigenvalue calculations for a set of matrices which can be constructed in a 
rather straightforward manner, our approach is in some ways comparable with Glover's method for solving the Hankel-norm 
approximation problem [16]. Promising developments which combine the current approach with numerical methods for solving 
large eigenvalue problems in related applications are reported in [7]. 

In the next section we obtain a special representation of the first-order necessary conditions for optimality. This representation 
is in the form of a set of quadratic equations, which take a special form which we call diagonal quadratic. The following 
section investigates such diagonal quadratic equations. It is shown that the polynomials which define these equations form a 
Grobner basis for the ideal generated by themselves. It is further shown that these equations have a finite set of solutions, 
and that in consequence a certain space is finite-dimensional. Furthermore a basis for this space is identified, which allows a 
solution method based on linear algebra. 

We then present such a method of solving a system of polynomial equations. This method relies on obtaining a Grobner 
basis, but in the application to the specific H2 problem considered here, such a basis is immediately available. This method 
of solving polynomial equations is of general use and it is known in the computer algebra community, see [10], [29], [34] 
and the references therein. The development here is self-contained and starts with constructing a matrix solution of the system 
of polynomial equations, from which the desired solutions can be found by solving a collection of eigenvalue-eigenvector 
problems. These eigenproblems can be solved either by numerical methods or by symbolic methods. We believe from a system 
theoretic point of view it is very natural to start with the construction of a matrix solution; in fact the matrices obtained are 
generalised companion matrices. 

A section then applies this method to the solution of the H 2 problem, for the case n = N — 1 and distinct poles of the 
original system. How to treat repeated poles is outlined in a short section. This is followed by two examples. 

II. A SPECIAL REPRESENTATION OF THE FIRST ORDER CONDITIONS. 

In this section the first order conditions for a class of H2 model order reduction problems will be considered. Studying 
the outcomes of a computer algebra calculation in which a set of symbolic first order conditions for the H2 model order 
reduction problem was brought into a recursive form, it was observed that the occurrence of multiple poles in the original 
system gave rise to a certain singularity in the first order equations. This was the motivation for investigating the class of 
systems with distinct poles separately from the class of systems with multiple poles. The continuous-time case is treated here, 
but the discrete-time case is in fact the same up to isometry (see e.g. [21], Theorem 5.4-3; [22], Theorem 3.2-22). 

Now let us set up the problem. In fact there are several equivalent formulations. One formulation which is closest to the 
form of the first order conditions that we use in this paper is as follows. (For other formulations refer to the literature, e.g. [18]) 

Consider a continuous-time stable SISO linear system. Without loss of generality we can assume the system to be strictly 
proper, because if it is not then the direct feedthrough term of the optimal H2 approximant will be equal to the direct feedthrough 
term of the original system, and the strictly proper part of the optimal approximant will not be influenced at all (nor will the 
strictly proper part of any of the critical points) by the value of the direct feedthrough term. Let the transfer function of the 
original system (i.e. the system that is to be reduced in order) be given by e(s)/d(s), where e is some polynomial with real 
coefficients of degree at most N — 1, and d is a monic polynomial with real coefficients of degree N with all its zeroes (i.e. 
poles of the transfer function) Si, 62, ■ ■ ■ , Sn, within the open left half of the complex plane. Assume that e and d are coprime. 

Consider the rational function ■ It is an element of the Hardy space H2 of square summable functions on the imaginary 
axis which are analytic on the open right halfplane and satisfy a certain continuity requirement on the imaginary axis(cf. [26]). 
In this paper we work with the subspace of real rational functions in i/ 2 - This subspace consists of all strictly proper real 
rational functions which have the property that all the poles lie in the open left half plane. The space _ff 2 is in fact a Hilbert 
space with corresponding norm ||.|| 2 of a function t e H 2 given by 

1 f°° 

11*112 = 2^: J l*MI 2 <k> 

Consider the differentiable manifold T,S n of all real rational functions in H2 such that b(s) and a(s) are coprime, the 
coefficients of a(s) and b(s) are real and a(s) is a Hurwitz polynomial of degree n. For more information about the structure 
of this differentiable manifold see for example [6] and [23] and the references given there. The iJ 2 model order reduction 
problem can now be formulated as the following optimization problem: 
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Remark. It is well-known that the distance squared 



is in fact a rational function of the coefficients of the 



e(s) b(s) 
d(s) a ( s ) 

numerator and denominator polynomials (see the literature, e.g. [21]; in order to obtain explicit rational function formulas one 
could use the methods proposed in [24] ) 

A well-known first order necessary condition for optimality of an n— th order transfer function b(s)/a(s) with real coefficients, 
as an approximant in H2 is the following. First let us present a geometric formulation. 



If 



b(s) 
a(s) 



is an optimal approximant of the transfer function with respect to the Hi— norm, then the difference 
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is perpendicular to the tangent plane at the manifold of transfer functions of order n at the point - 
It is well-known (and not hard to show) that the tangent space consists of all strictly proper rational functions of the form 
, where p is a polynomial of degree at most 2n — 1. From the theory of Hardy spaces it follows that the orthogonal 
complement in H2 of this vector space is given by a(—s) 2 H2, i.e. all #2 —functions which can be written as the product of 
the function a(— s) 2 and an arbitrary H2 function. Combining this with the first order conditions given above, it follows that 



the numerator of the difference 
be written down as follows: 
Let n < N. If ^ 
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has to be divisible by a(—s) . (Cf. [28], see also [2], [3]). Algebraically this can 



, is an optimal approximant within the class of transfer functions of order n in H2, of the transfer function 
in i?2, with respect to the H2— norm, then there exists a polynomial q(s) of degree at most N — (n + 1) such that 

a(-s) 2 q(s). 



e(s)a{s) — b(s)d(s) 



(3) 



Let us now specialise to the case in which n = N — 1 and the original system has distinct poles, i.e. the multiplicity of 
each of the N = n + 1 poles Si, ... ,8^ is one. The rest of this paper concentrates mostly on this case. Now the polynomial 
q(s) has degree zero, so it reduces to a constant q(s) = qo- The unknowns in the polynomial equation are the polynomials 
b(s), a(s) and the number qo. Although qo is only an auxiliary variable we will not eliminate it. Note that once the polynomial 
a and the number qo are known, the polynomial b follows from the formula 



b(s) 



e(s)a(s) - qoa(-s)' 2 
d(s) 



Substituting s = 84, i = 1, . 



, N in the polynomial equation (01 one obtains: 

e(8i)a(5i) = a(-5i) 2 q a , i = l,... 



N. 
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Note that the polynomials appearing here do not depend on the polynomial b, due to the fact that d(Si) = for each 
i = 1, . . . ,N. Further note that the possibility qo = can be excluded on the grounds that if go = then either e(<5j) = 
for some value of i 6 {1, . . . , N}, which implies that there is pole-zero cancellation in the original transfer function and the 
order of the transfer function will be smaller than N, which can be ruled out without loss of generality, or otherwise it would 
follow that a(s) = in N different points, namely at s = 5i,i = 1, . . . , N, which together with the fact that a has degree 
n = N — 1 would imply that a = 0, which is in contradiction with the assumption that a is monic. It follows that qo ^ for 
each value of qo that corresponds to a solution of the first order equations. Therefore multiplying both sides of the polynomial 
equation with qo the first order conditions can be rewritten as 



e(Si)a(Si)qo = {a(-Si)qo) , i = 1, . . . , N, q ^ 0. 



(6) 



The polynomial a is monic, so qo is the leading coefficient of the non-zero polynomial a := qoa. Using this notation the first 
order equations can be rewritten as 



e(Si)a(6i) = a(-5iY, i = 1, . . . ,N, a + 0. 



(7) 



The idea is now to consider this as an equation in the unknowns a(— 6i),i = 1, ... ,7V. In order to do this explicitly we 
need to express the sequence of numbers a(Si), i = 1, . . . , N in terms of the sequence of numbers 5(— 64), i = 1, . . . , N. This 
can be done by relating both sequences to the coefficients a,j,j = 0,...,N 
dos°. Let V(Si, . . . , Sn) denote the Vandermonde matrix 
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Using matrix-vector notation the following linear relations are obtained: 
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and 



a(-Si) 
a(-S N ) 



= V(-S 1 ,...,-5 N ) 



a 
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It follows that 



fi(*i) 



a(5 N ) 



= V(5 1 ,...,d N )V(-S 1 ,...,-6 N )- 1 



a(-6 N ) 



(11) 



Note that V{—8\, . . . , — 5n) is an invertible matrix because, by assumption, for alH = 1, . . . , N, j = 1, . . . , N, if i ^ j then 
Si ^ 5j and therefore we have det (V(—6i, . . . , — £jv)) = Iii<i<j<N(Si — Sj) ^ (cf. e.g. [27], p. 35). 
The first order equations can now be rewritten as 
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where diag(e(#i), . . . , e(£jv)) denotes the diagonal matrix with e(5i) in the (i, i)— entry, i = 1, . . . , N. 
This means that these first order equations can be written as 



x ^ 



where Xi = a(—Si),i — 1, . . . , N, x = {x\, . . . , Xn)' and 

M = diag(e(5i), . . . , e(5 N ))V{5 u . . . , S N )V{-S U -S N )-\ 
In the next section the solution of equations of the form found here will be treated in general. 

III. Diagonal-quadratic systems of equations 
In this section we will present results about an arbitrary system of polynomial equations of the form 
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where /i £ C N is a constant N— vector. This will be called a diagonal-quadratic system of equations. 



Remark. A quadratic equation in x can be written as x Ax 



cx 



d for some symmetric matrix A, a row vector c and 



a scalar d. If A = aef , for some i e {1, . . . , N}, then the equation is one of the form described above. If there are N 
quadratic equations and the corresponding A— matrices are all diagonal, and these diagonal matrices form a basis of the linear 
vector space of all diagonal N x N matrices then such a system can (obviously) be rewritten in the form above. That is the 
motivation for the terminology 'diagonal-quadratic'. 

In this paper use will be made of Grobner basis theory and constructive algebra. For an exposition of this theory one can refer 
to e.g. [11]. In Grobner basis theory an important role is played by the so-called monomial orderings. Let a = (a\, . . . , aw) 
denote an arbitrary vector of nonnegative integers, which will be called a multi-index in the sequel, then x a will denote the 
monomial x a :— x^x^ 2 ■ ■ -x^f . The multi-index a is called the multi-degree of the monomial x a . The corresponding total 
degree is defined as \a\ :— ct\ +a 2 + . . . + a^- For a general definition of monomial ordering we refer to [11], p.54, Definition 
1. 

A partial ordering of monomials is defined by x a y x@ if |a| > \/3\. Such an ordering is called a total degree ordering. For 
our purposes any complete ordering which is a refinement of the total degree ordering would do. For definiteness we choose 
to work with the graded lexicographic ordering, which refines the total degree ordering as follows: if \a\ — \/3\ then x a >- x 13 
if a.i > Pi for the smallest integer i G {1, . . . , N} for which on ^ fy. 

The total degree of a polynomial is defined as follows. Each polynomial is a unique linear combination of monomials with 
nonzero coefficients. The maximal total degree of these monomials is called the total degree of the polynomial. If we denote 



the i— th row of the matrix M by m t and the i— th entry of the vector \i by Hi for i e {1, . . . , N}, then the equations can be 
rewritten as 

xf — mix — Hi = 0, i = 1, . . . , N. 

Let gi(x\, . . . ,Xn) := x\ — rriiX — Hi,i = 1, . . ., N, then we are looking for the zeros of the ideal I spanned by G := 
{9i,92, ■ ■ -,9n}- 

Let < gi, . . . , gN > denote the ideal generated by the set of polynomials gi, . . . , g^. For a polynomial /, let LT(f) denote 
the leading term of /, and for an ideal / of polynomials, let LT(I) denote the set of all leading terms of the polynomials in 
/. 

Definition 3.1: For a fixed monomial ordering, a finite subset V = {71, . . . , 7^} of an ideal I is a Grobner basis if 

<LT{ ll ),...,LT{ lu ) >=<LT(I)>. 

Theorem 3.1: The set G is a Grobner basis with respect to total degree ordering. 
Proof. With respect to any ordering which is a refinement of partial ordering by total degree, the leading terms of G are 
monomials of the form xf. These are clearly pairwise coprime. But it is known that this implies that G is a Grobner basis [12, 
p.333, Ex.15.20]. □ 
An alternative but longer proof is available in [19]. 

This result is very important because to apply the results of Grobner basis theory one needs a Grobner basis. Usually one 
needs to apply an algorithm like Buchberger's algorithm to bring a set of polynomials that generates the ideal in which one is 
interested in Grobner basis form. In fact in many cases this is the most difficult part of the calculations. In the case at hand 
however the set of polynomials of which we want to find the zeros itself forms a Grobner basis. 

But that is not all. We can say more. We know that G = {g\, . . . , gN} forms a Grobner basis and that the leading monomial 
of gi is xf for each i = 1, . . . , N. Let C[xi, . . . , Xn] denote the ring of polynomials with complex coefficients. Let R denote 
the set of multi-indices R := {0, 1}^. In other words, R is the set of all multi-indices a = (ai, . . . , ajv) with the property that 
for each i = 1, . . . , N one has either Qj = or at = 1. Let Q denote the set of all multi-indices outside R. For each polynomial 
p = p(x) there exists a unique additive decomposition p = p R + p® , where the polynomial p R is a linear combination of 
monomials with multi-degree in R and p® is a linear combination of monomials with multi-degree in Q. 

Lemma 3.1: Let I denote the ideal generated by G. 

(i) The set V = V(I) of zeros in C N of the system of polynomial equations gi(x) = 0, i = 1, . . . , N, is finite. 

(ii) The C— vector space S = Span(x a : x a <^< LT(I) >) is finite-dimensional. 

(iii) The C— vector space C[xi, . . . , Xn]/I is finite-dimensional. 

(iv) The set of monomials {x a : a E R} forms a basis for the vector space S. 

(v) The dimension of the vector space S is 2 N . 

(vi) The dimension of the vector space C[xi, . . . , Xn]/I is 2 N . 

Proof. ad(i)-(iii). (i) — (iii) follow immediately from [11, Chapter 5, Theorem 6]. 
ad (iv). Because G is a Grobner basis the ideal < LT(7) > is equal to the ideal generated by the leading terms of the elements 
of G, i.e. the ideal < x\ , . . . ,x% > . The monomials in this ideal are precisely those which have multi-degree in the set Q. 
Therefore the monomials in S are the all the monomials with multi-degree in R. 

ad (v). From (iv) it follows that the dimension of S is equal to the cardinality of R, which is card(R) = 2 N . 
ad (vi). According to Proposition 4 of Chapter 5 of [11] the vector space C[x\, . . . ,Xm]/I is isomorphic to S and therefore 
has the same dimension as S. □ 
From [11], Chapter 5, Section 3, Proposition 1 it follows that every polynomial in C[x\, . . . , xjy] can be written in a unique 
way as the sum of an element of S and an element of I. In other words, each equivalence class / + 1, where / is an arbitrary 
polynomial in C[x\, . . . ,Xn], has a unique representative in S. Let this representative be denoted by 7r(/) € S. Given /, the 
polynomial n(f ) can be obtained by a general method from Grobner basis theory, namely the so-called division algorithm with 
respect to the Grobner basis G as described in e.g. [11]. However, for diagonal quadratic equations, the division algorithm 
simplifies considerably and one can describe in direct terms how one can obtain 7r(/) from /. The 'reduction procedure' can 
be described as follows. Using the same notation as above, one can write f = fQ + f R , where f R e S and the monomials 
of / Q all have multi-degree in Q. This additive decomposition is obviously unique. If fQ = then / = f R e S in which 
case 7r(/) = / and we are done. If ^ then consider any monomial of with total degree equal to the total degree of 
f®. By construction each such monomial is divisible by at least one of the monomials x\ , x\, . . . , x N . If it is divisible by xf 
then replacing it by the polynomial that is obtained by multiplying the monomial by the result is a polynomial / that is 

in the equivalence class / + I and which has the following property. Either the total degree of is smaller than the total 
degree of , or otherwise the total degree of J® is equal to the total degree of but the number of monomials in with 
total degree equal to the total degree of is one less than the number of monomials in with total degree equal to the 
total degree of f®. Such a replacement of / by / will be called a 'reduction step'. It follows that after a finite number of 
such reduction steps one arrives at a polynomial in the equivalence class f + 1 with the property that it lies in S. This is then 
the unique polynomial ir(f) that was sought for. 

The importance of this reduction procedure in our application will become clear in the examples section. 



IV. Commutative matrix solutions of polynomial equations 

In this section a method to obtain the solutions of a system of polynomial equations in several variables will be outlined. A 
method of this kind was originally developed by [29], [34]. A similar approach, but differing in some details, was developed 
by the authors of the present paper, is available in [19], and is the approach which will be summarized here. All proofs are 
omitted from this section since they are available in the works cited above. 

We will consider the situation in which the system of polynomial equations will have a finite number of solutions over the 
field of complex numbers C. In the modern constructive algebra approach to the problem of finding the roots of a system 
of polynomial equations the theory of Grobner bases plays an important role. For this theory we refer, as before, to [11]. A 
fundamental theorem of the theory of Grobner bases is that for any polynomial ideal given by a finite number of polynomials 
which generate it, a Grobner basis can be calculated with respect to any admissible monomial ordering (like the lexicographical 
ordering or the total degree ordering) in a finite number of steps. It can for example be obtained by Buchberger's algorithm. 
However the number of steps required by such an algorithm can be huge. In the literature it is suggested that in order to obtain 
the roots of a system of polynomial equations, one can construct a Grobner basis with respect to a lexicographical ordering 
[11, p. 233], [15, pp. 459-462]. Also in the paper [18] this approach was followed to show that under two hypotheses described 
in that paper, the H 2 model order reduction problem can be solved in a finite number of steps. However only examples of 
reduction of third order models were presented in that paper. The bottle-neck in the calculations was the construction of a 
Grobner basis. In the previous section it was shown that for the problem of reduction of the model order by one with respect 
to the H 2 norm, in case of an original model with distinct poles, the first order equations found already are in the form of a 
total degree Grobner basis, so no Grobner basis construction at all is required in the application at hand. 

The idea is first to construct a commutative matrix solution for a system of polynomial equations which is in Grobner basis 
form. 

Definition 4.1: Let N be a positive integer. Let / G C[xi, . . . , xjv] be a polynomial in the variables x\, . . . , xn- Let M be 
a positive integer and consider an N— tuple (A\, A 2 , . . . , An) of square M x M matrices that commute with each other, i.e. 
AiAj — AjAi for each pair (i, j), i = 1,. . . ,N, j = 1, . . . , N. Then (A\, A 2 , . . . , An) will be called a commutative matrix 
solution of the polynomial equation / = if f(A\, . . . , An) — Om, where the symbol Om denotes the M x M zero matrix. 
In the following, an M x M zero matrix will often be denoted by the symbol 0, as is usual, instead of the symbol Om- The 
size of the zero matrix should then be clear from the context. An N— tuple of M x M matrices (A\, . . . , An) will be called 
a commutative matrix solution of a system of polynomial equations in N unknowns x\, . . . , Xn, if it is a commutative matrix 
solution for each of the polynomials in the system. 

From a commutative matrix solution a scalar solution can be obtained by considering any common eigenvector of the 
matrices. The corresponding eigenvalues form an AT— tuple which is in fact a scalar solution of the system of polynomial 
equations involved. The commutative matrix solution that will be constructed here for the case of ideals with zero dimensional 
variety, has the property that ALL (scalar) solutions can be obtained in this way. 

It will first be explained how such a commutative matrix solution can be constructed. Then it will be shown how the (scalar) 
solutions of the system of polynomial equations can be derived from the matrix solution by eigenvalue-eigenvector calculations. 
If T is a field containing all the coefficients of the polynomial equations then all the entries of the matrix solution that will 
be constructed will be contained in T\ in other words, only additions, subtractions, multiplications and divisions are required 
to obtain a matrix solution. 

We start with two results which hold for an arbitrary polynomial ideal. For these results to hold, the ideal does not have 
to have the property that the number of common zeros of the polynomials in the ideal is finite. The two results consist of a 
number of observations concerning the operation 'multiplication by Xi modulo the ideal', for i G {1, . . . , A}. Composition of 
a pair of mappings X 1 Y will be denoted (as usual) by X o Y. 

Theorem 4.1: Let A be a positive integer. Let I C C[xi, . . . , x N ] be an ideal and let V :— C[xi, . . . ,xn]/I denote the 
corresponding residue class ring. Let i G {1, . . . , N} be fixed. Let f\, f 2 G C[xi, . . . , xn}. If /i and f 2 are equal modulo 
7, then Xifi and Xif 2 are equal modulo 7. The mapping Xi : V — > V, / + 7 i— > Xj/ + 7, is a linear endomorphism. For 
i, j G {1, . . . , N} arbitrary, Xi o Xj = Xj o Xi i.e. the linear mappings Xi and Xj commute. The mapping Xi o Xj is the 
mapping given by / + I XiXjf + I. 

For any pair of linear endomorphisms X, Y let us interpret XY as the composition X o Y, let us interpret X° as the 
identity and for each positive integer k, let us interpret the power X k as the fc— fold composition X o X o . . . o X. Using this 
interpretation for any A— tuple of commutative linear endomorphisms X\,. . . ,Xn and any polynomial / G C[xi, . . . , Xn], 
the expression f(X\, . . . ,Xn) denotes a well-defined linear endomorphism. 

Theorem 4.2: Let A, 7, V and Xi, i = 1, . . . , A be as given in the previous theorem. For any polynomial / G C[xi, . . . , xn] 
the linear mapping f(X\, X 2 , . . . , Xn) ■ V i— ► V is well-defined. 

The following two statements are equivalent, 

(i) / G 7, 

(ii) f(Xi, . . . , X N ) is equal to the zero mapping V — > V, / + I ^ + I. 



Now we will specialize to systems of polynomial equations with finitely many common solutions. We will make extensive 
use of the results from section 3 of Chapter 5 of [11], pp. 228-235, especially Propositions 1 and 4 and Theorem 6 of that 
section. 

Let gi(x\, . . . ,xn) = 0, . . . ,gN'(xi, ■ ■ ■ ,%n) = denote a system of N' polynomial equations with complex coefficients 
in the N variables x\, . . . , xn- The complex vector (£i, . . . , £jv) S C N is a root of the system of polynomial equations if for 
each j — 1, . . . , N', 

9j(£u ■■■,&) = 0. 

Let I =< gi, . . . ,g^i >C C[xi, . . . , xm] denote the ideal generated by the polynomials gi{x\, . . . , xjv), . . . , gN'{x\, ■ ■ ■ ,Xn). 

Suppose that G = {gi, . . . ,gN'} is in fact a Grobner basis for /, with respect to some fixed monomial ordering. Similarly 
to what was noted in the previous section for the special case of diagonal-quadratic systems of polynomial equations, the 
following can be said for this more general case. Each polynomial / G C[xi, . . . , xjv] is congruent modulo I to a polynomial 
r with leading term that cannot be reduced by any of the leading terms of the polynomials in the Grobner basis; for each 
/ the associated polynomial r is unique [11, Chapter 5, Section 3, Proposition 1] and will be denoted by / . The set V 
of all polynomials r obtained in this way forms a finite dimensional vector space if and only if the number of roots of the 
system of polynomial equations is finite. If this set is indeed a finite dimensional vector space, then it has a basis consisting 
of monomials, namely all monomials that cannot be reduced by any of the leading terms of the polynomials in the Grobner 
basis. This result is due to Macaulay [12, Theorem 15.3, p. 325]. Given the monomial ordering it is a straightforward task to 
list these monomials ( [11]). Let this basis be denoted by B. The mapping V —> V, r i— ► r + I, is a linear bijection of vector 
spaces. In case V is finite dimensional, let B denote the basis of V obtained as the image of B under this mapping. Let D 
denote the dimension of V. 

For each i G {1, . . . , N} let denote the D x D— matrix of the endomorphism Xi with respect to the basis B. 
Using this set-up the following fundamental result can be obtained. 

Theorem 4.3: Let a monomial ordering be fixed and let G be a Grobner basis of the ideal /. Let the associated linear 
space V be finite dimensional with dimension D. Let / G C[x\, . . . , xn] be given. Let the mappings Xi, i = 1, . . . , N and 
f(X\, X2, ■ ■ ■ , Xn) be as given in the previous theorems. 

The matrix of the linear mapping f{X ll X 2 , ■ . . , Xn) ■ V — > V with respect to the basis of monomials B of V is equal to 
f(A Xl ,Ax 2 ,...,A XN ). 

The following two statements are equivalent, 

(i) / G I, 

(ii) f(Ax l: Ax 2 , ■ ■ ■ , Ax N ) = 0, i.e. this matrix is the D x D zero matrix. 

This theorem tells us that the N— tuple of matrices (Ax 1 , ■ • ■ , Ax N ) is in fact a commutative matrix solution of any system 
of polynomial equations that generates /. 

The entries of the k— th column of the matrix Ax, are obtained as follows. Let the fc— th element of the basis B of monomials 

Q Q 

be denoted by bk- The monomial Xibk is either itself in the basis B or otherwise Xibk ^ xib^. In both cases Xibk can be 
written as a unique linear combination of the elements of B. The coefficients of the linear combination are the entries of the 
A;— th column of the matrix Ax t - If x^k is itself in the basis B then the k— th column of the matrix Ax t is a standard basis 
vector. 

In the case N = 1 then there exists a unique monic polynomial g such that / is generated by g. In that case the matrix 
Ax 1 is a companion matrix of g (cf. e.g. [27, p. 68]). 

Now suppose that the vector v is a common eigenvector of the matrices Ax 1 , • • • , Ax N with corresponding eigenvalues 
£1, £2, • ■ • , £,n, respectively, i.e. for each i G {1, . . . , N} the equality Ax { v — &v holds and v ^ 0. Then for each f E I one 
has 

= f(A Xl ,...,Ax N )v = f(^,...^ N )v 

and therefore . . . , £n) — 0. It follows that (xi, . . . , xjy) = (£1, ■ • ■ , £iv) is a root of any system of polynomial equations 
that generates the ideal /. 

The following fundamental result states that in fact each of the finite number of roots is obtained in this way. 

Theorem 4.4: Let N be a positive integer and let I be an ideal in the ring C[xi, . . . , Xn] such that the corresponding set 
Z G C N of common zeros of all the polynomials in / is finite. Let Xi, i = 1, . . . , N be as defined above. Then for each vector 
£ = (£1, . . . , G Z there exists a polynomial w G C[xi, . . . , xjv], w /, with the property that for each i = 1, . . . , 7Y, the 
following equality holds: 

Xi(w + I)=&w + I, 

i.e. w is a common eigenvector of the mappings X\, X2, . ■ . , Xn, with corresponding eigenvalues £1, . . . , £jv, respectively. 
From this theorem we have the following important corollary. 

Corollary 4.1: Let N, I and Z be as given in the previous theorem. Let Xi, i = 1, . . . , 7Y be as defined above. Let a 
monomial ordering be given and let G be a Grobner basis of I with respect to this monomial ordering. Let B denote the 
basis of all monomials in C[xi, . . . , xn] that are not included in the ideal < LT(G) > generated by the leading terms of the 



elements of G and let B denote the corresponding basis of the residue class ring C [x\ , . . . , xjv]//, as before. Let Ax x , ■ ■ ■ , Ax N 
denote the matrices of the linear endomorphisms X%, . . . ,Xn, respectively, with respect to the basis B. Then the following 
two statements are equivalent. 

Ci) £ =(&,..., frr)'ez. 

(ii) There exists a common eigenvector v G C N \ {0} of the (commutative) matrices Ax 1 , ■ ■ ■ , Ax N with corresponding 
eigenvalues £1, . . • , £jv respectively, i.e. there exists a nonzero vector v with the property 

A Xi v = iiV, i = l,...,N. 

Various alternatives arise as to how to exploit the theory presented here to solve a system of polynomial equations, starting 
with a Grobner basis. The commutative matrix solution presented can be calculated in symbolic form if the original system 
of equations is in symbolic form and it can be calculated exactly in numerical form if the coefficients of the original system 
of polynomials are given numerically. From the commutative matrix solution the roots of the system of polynomial equations 
can be obtained either by exact algebraic methods or by numerical methods that involve round-off errors. The exact algebraic 
approach will not be worked out here. 

A (nonexact) numerical approach can be based on numerical calculation of the eigenvalues and eigenvectors of the matrices 
involved. In the examples section this approach will be applied to the H2 —model order reduction problem. 

The possibility of using a mixture of exact and symbolic calculations with numerical calculations is very promising for 
obtaining practically useful results. The matrices involved will tend to become huge (in terms of numbers of entries) if the 
number of variables involved grows; however eigenvalue calculation can be done numerically for quite big matrices. In section 
IVIII matrices with several hundreds of rows and columns are used. One can expect that usage of more refined numerical 
techniques will make it possible to push the limits quite a bit further. 

Let / G C[xi, . . . ,Xn] and let F be the corresponding linear endomorphism of C[xi, . . . , x n]/I defined by g+1 1— > f.g+I. 
If the number of common zeros of the polynomials in / is finite, and we have a basis B of C[xi, . . . , xn]/I as before, then we 
can represent F with respect to this basis by a matrix Ap . It is now straightforward to see that Ap = f {Ax x , Ax 2 , ■ ■ ■ , Ax N ) ■ 
More generally if / = /«, /d 6 C[xi, . . . , Xjv] and fd(0 ^ for each common zero £ of the polynomials in /, then F 
and Ap are again well-defined and Ap — f n (Ax x) ■ ■ ■ ,Ax N )- (fd{Ax 1 , ■ ■ ■ ^Ax^)" 1 ■ The eigenvalues of this matrix Ap 
are {/(£)!£ e Z}. For example in optimization problems in which the criterion function /, say, is a rational function this can 
be used to obtain the matrix Ap which has as its eigenvalues the critical values of /. (The values that a function takes on its 
set of critical points are called the critical values.) The matrix Ap could be called a critical value matrix and its characteristic 
polynomial a critical value polynomial. This is related to Theorem 9 and the subsequent Remark 10 in [18] concerning the 
existence and usage of a univariate polynomial which has the critical values of the criterion function as its zeros. 

V. Model order reduction by one in H 2 

Recall the formulation of the H2 model reduction problem from Section|II] In order to facilitate the statement of the following 
theorem let us define the set 3 as follows. Let | G HSjv have N distinct poles 6i,...,5n G C. Let the matrix M be as 
given in equation (TT~4-b and let 3 denote the set of solutions in \ {0} of equation ( TT~3T >. The diagonal quadratic system of 
equations ( TT~3T > is shown to form a total degree Grobner basis in Theorem 13.11 In Lemma 13.11 a basis of 2 N monomials of the 
corresponding vector space S is presented. This basis consists of the monomials outside the ideal generated by the leading 
terms of all polynomials in the ideal corresponding to the diagonal quadratic equations. Let this basis be denoted by B. Then 
Corollary 14. II can be applied to ( TT3l > using the basis of monomials B. The implication is that in this case the set 3 just defined 
is equal to the set Z of that Corollary, except that the zero vector is removed: 

S = Z\{0} 

It follows that 3 contains at most 2^ — 1 elements, each of which is a vector of N entries that can be found as the eigenvalues 
corresponding to any common eigenvector of the matrices Ax x , ■ ■ ■ , Ax N from Corollary 14. II We therefore have the following 
theorem 

Theorem 5.1: Let % G £SV have N distinct poles 6%, . . . , 5n £ C. 

(i) The number of critical points of the criterion function / : ESV-i — * [0, 00), i h-> ||£ - i|g is finite and not greater than 
2 A ' - 1. ' " 

(ii) If the rational function - G SSV-i is a critical point of / then there exists a number qo and a vector £ G 3 C C N \ {0} 
such that qoa(-Si) = i = 1, . . . , N. For given qo an d £ the polynomial a is uniquely determined by this linear system 
of equations and b is uniquely determined by equation (@J. 

Of course the solutions that will be found for the first order equations will in general not all correspond to rational functions 
- G ESiv-i: it is certainly possible that some will not correspond to real systems; some may correspond to real but unstable 
systems. 

An algorithm to obtain all the critical points of the criterion function of H2 model reduction by one could now be constructed 
as follows. 



1) Construct the matrix M. 

2) Construct the matrices Ax 1 , ■ ■ ■ , Ax N ■ 

3) Calculate the eigenvalues of these matrices that correspond to a common eigenvector of all these matrices. The result is 
a vector £ G C N . All nonzero vectors £ obtained in this way form the (finite) set 3. 

4) For each element of 5 solve equation ( fT~3b for a and qo, and select those a that are real and Hurwitz. 

5) For those a selected in the previous step, solve equation for b. 

Note that steps (1) and (2) can be done purely symbolically. Apart from considerations of memory storage and perhaps 
calculation time, it is not necessary to specify the original system; one can present it symbolically by its poles and the (non-zero) 
values of the numerator polynomial in these poles. 

If the original system is specified numerically then step (3) can be worked out by either constructive algebra algorithms 
(using e.g. methods of isolation of zeros of polynomials) or by numerical algorithms that admit round-off errors. In section 
VIII we present some results obtained by calculations of the latter type. 

Step (4) requires that we go through the solutions in 3 to find out those that are admissible and a solution is admissible if 
a is both real and Hurwitz. This can be done by first eliminating all the complex a's and then checking whether the real a's 
are Hurwitz. 

Note that the pairs a, b found in Steps (4) and (5), respectively, are coprime as a consequence of equation 10} , and that b is 
real, and hence that - e 

The global approximant is found by selecting from the finite set of critical points the point that minimizes the criterion 
function / defined in Theorem 15. II This follows from the fact that this criterion function / is differentiable everywhere and has 
a global minimum (cf. [2] and the references therein). The global approximant can now be found by choosing the admissible 
solution that gives the least criterion function. 

This process can be simplified, since one is interested in locating only the global approximant. We shall show that it is 
possible to construct a matrix, the eigenvalues of which include the values of the criterion function / at the critical points. One 
can therefore search among these values, starting with the smallest positive real value, until one finds one which corresponds 
to an admissible approximant. This will be the optimal approximant. As will shortly be shown, the attraction of this approach 
is that many elements of 3, namely those which yield complex value of / and those which correspond to non-Hurwitz a 
polynomials, will never be visited by this procedure. 

For any rational function t for which the Lebesgue integral ^- J_ \t[iuj)\ 2 dui is finite let us define the L2— norm ||t||2 by 



t\\i ■■= tt / l*MI 2 ^- 



1 

2~n 

Note that for any rational function t in H2 this definition coincides with the definition of \\t\\2 given before. We have the 
following theorem. 

Theorem 5.2: Let ^ 6 YjSn have N distinct poles 6i,...,8jf 6 C. 

Let a(s),b(s),qo be a real solution of the polynomial equations Q,©, then 

1) 



e(s) b(s) 



d(s) a(s) 



N 

E 



(16) 



where x t = a(-5i) = q Q a(-5i), i = 1, . 



,N, (as before) and d'(s) denotes the derivative of d(s) with respect to s. 

2) If a(s) is Hurwitz then the L2-norm computed above coincides with the i?2-norm. 

3) If a(s) is not Hurwitz then the L2-norm squared computed above is strictly greater than the global minimum of the 
criterion function / as defined in Theorem 15. II 

Proof. Let us first prove part 1 of the theorem. Due to the first order condition ||3}, combined with the equality q(s) = qo 
and combined with the assumption that e,d,a,b are real polynomials, and combined with the fact that d and a are monic 
polynomials and therefore unequal to the zero polynomial, one has 
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The residue theorem of complex analysis can now be applied. We use the fact that limisi^oo s 2 ( ^( s )^(_sj J = 1 to argue that 
the integral over the imaginary axis is equal to the integral over a sufficiently large semi-circle together with a sufficiently 



duj. 



large segment of the imaginary axis. This is a standard argument in complex analysis that we will not repeat here (see e.g. 
[30]). The residue theorem now tells us that the integral is equal to 
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The first order conditions (O can be rewritten as 



Substituting this and using cc, = a(— <5j) it follows that 



a( S i) = "^f'? ,i = l,...,N, a ^ 0. 



E 



I, e(g) _ H£)n2 _ 

ll d(s) a( s y 2 ^e(6i)dr(Si)d(-6i 
This shows 1. 

Part 2 of the Lemma follows immediately from the fact that the L2 norm and the H2 norm coincide for all elements in H^- 
(See also the remark made above after the definition of the L2 — norm). 

Proof of part 3: Suppose that a is not Hurwitz. Then it can be factored uniquely as a = 0102, where a\ and a-i are monic 
and a\(s) and ai{— s) are Hurwitz polynomials in the variable s, with deg(ai) < n. There are corresponding polynomials 
61,62 with deg(6i) < deg(ai) and deg(&2) < deg(ct2) such that = a^(s) a|(fj ■ ^ s * s we H-known (and following from 
Cauchy's theorem in complex analysis) 

1 r ^M-^\ du=0 



doj = 0. 



27r J_ oc ai{itjj)a2{~iuJ 
and similarly 

1 f 00 e(iu)b 2 (-ioj) 
2n d(iuj)a 2 {-ioj) 

From this well-known orthogonality property in L2 it follows that 
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This number is larger than the global minimum of the function / of Theorem 15.11 because is the transfer function of 

a system of order < n. As noted before it is well-known that the H2— norm squared of the difference between the original 
system and an approximant of order < n, is always larger than the global minimum of the H2— norm squared of the difference 
between the original system and an approximant of order n. This finishes the proof of part 3 and of the Theorem. 

□ 

For any complex polynomial p G C [s] let p denote the polynomial that is obtained when the coefficients of p are replaced 
by their complex conjugates. I.e. p is the polynomial with the property that p(r) = p(r) for all r G R, where s denotes the 
complex conjugate of a complex number s. 

Lemma 5.1: Let ^ G T,Sn have N distinct poles Si, ... ,5^ G C. 

Let a(s),b(s),qo be a complex solution of the polynomial equations (HJi,®. Then a(s) ,b(s) ,qo is also a solution. 

The corresponding numbers J2?=i e(8?\<v(s/)d\~8,) and £i=i e (S^{S^d(-Si) form a com P lex conjugate pair. In particular 
this implies that if one of these numbers is real the numbers are equal. 



Proof. Consider a complex solution a(s),b(s),qo of the first order equations e(s)a(s) — b(s)d(s) = a(—s) 2 qo. Because 
polynomials are completely determined by their restriction to the real numbers, an equivalent formulation of the first order 
equations is e(r)a(r) — b(r)d(r) — a(—r) 2 qo for all r G R. Complex conjugation of these equations gives e(r)a(r) — b(r)d(r) = 
r ) 2 Q0y which shows that a(s),b(s),qo is also a solution. 

Because h is a real polynomial with distinct zeros the set of zeros of h consists of an even number, 21, say, of complex 
solutions and n — 21 real solutions. The 21 complex solutions can be partioned into I pairs of complex conjugate solutions. It 
is easy to see that for each real zero 8 of h, 

qoa(-S) 3 
e(S)d'{S)d{-6) 

and 



qoa(-S) 



3 



e(6)d'(S)d(-S) 

is a complex conjugate pair. And if S, 5 is a complex conjugate pair of zeros of h, then the complex conjugate of 

q a(-5) 3 qaa(-I) 3 



is equal to 



Combining this it follows that 



and 



form a complex conjugate pair. 
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q a{-Si) 3 
e(5i)d'(8i)d(-5i) 

□ 



For ease of reference, let 4> be defined by <j> : S — ► C, x i— > ^2^ =1 



e(5i}d'(Si)d(-Si) ' 

Using the results above one can find the global minimum of the criterion function as follows. For each of the at most 
2 N — 1 elements of H, evaluate the numbers <j>(x) G C. At least one of these numbers will be real and positive. Let k denote 
the number of distinct real positive numbers obtained in this way and let us denote these numbers by mi , . . . , m^ where 
mi < . . . < TOfc. Consider the set _1 (mi). If each £ G $r l {m\) corresponds to a complex non-real solution a(s),b(s),qo 
of the polynomial equations (0J,(|5j, there must be an even number of such solutions, as a result of Lemma (15.11 ). If any of 
the solutions is real then according to Theorem 15.21 the global minimum is equal to mi and all real solutions a(s),b(s),qo 
that correspond to this number are global approximants. If none of the solutions that correspond to £ G (/> _1 (mi) are real then 
consider the set <fi~ 1 (ni2). If any of the corresponding solutions a(s), b(s), qo is real then mi is the global minimum, otherwise 
consider the solutions that correspond to 7713 etc. One of the numbers mi, . . . , m* is the global minimum and therefore the 
global minimum will be found in this way. It follows from Theorem 15. 21 that all real solutions a(s), b(s), qo that correspond 
to the global minimum are in fact admissible, i.e. a(s) is Hurwitz and a(s) and b(s) are coprime. 

Remark. Note that the function is a polynomial and therefore continuous and smooth. Depending on the size of the 
coefficients g7T7Tg77^w3g7^ a perturbation in x due to numerical round-off error may cause a limited perturbation in the 
corresponding value of <\>. This implies that if the size of the coefficients just mentioned is not too big, and the perturbations 
in x are limited then the effects of round-off error on the calculated critical values are limited. This can be contrasted with the 
possible effect of perturbations on the calculation of the critical points. Especially if a critical point G T,Sn-i has poles 
near the imaginary axis, a small perturbation may produce a denominator polynomial with one or more right half-plane zeros, 
and therefore an inadmissible system, outside the manifold ESjv-i. Note that even if due to round-off error our algorithm 
would not produce a reliable global approximant, knowledge of the value of the global minimum of the criterion function 
could be used to evaluate the performance of other algorithms for the H2 model order reduction problem. 

Remark. The formula for <f> in the Theorem can be used to build the critical value matrix Ap that was mentioned at the end 
of the previous section, by taking the polynomial / mentioned there equal to <j>. Note that because is a polynomial no matrix 
inversion is required in the calculation of Ap — 4>{Ax 1 , ■ ■ ■ , Ax N ). The matrix Ap can also be built up by direct construction 
of the matrix of the endomorphism F with respect to the basis B of monomials defined earlier. 



VI. Repeated poles 

In this section we briefly outline how the development is changed if any of the poles of the original system are repeated, and 
indicate the additional difficulty which arises in that case. For simplicity of exposition we assume that one pole has multiplicity 
two: Si = &2, and the other poles are distinct. In this case (0 gives only N — 1 independent equations. An additional equation 
is obtained by differentiating J3), which leads to 

e(5 1 )a'{S 1 ) + e'(5i)a(5i) = -2a{-5x)a! {-8 X ) (17) 

(Note that we have used d{8\) — d'(S±) — here.) Taking x\ = a(— <5i), = a'(—Si), x.- L = a(—6i) for i = 3, . . . ,N, one 
obtains again a system of A quadratic polynomial equations in x%, ...,xn representing the first-order conditions. 

This system of equations will not yet be in Grobner basis form, in contrast to the case of distinct poles. So at this point 
it is necessary to employ Buchberger's algorithm to obtain a Grobner basis for the corresponding ideal. Subsequently the 
Stetter-Moller matrix method can again be used to find the critical points and hence the global optimum, provided that the 
number of critical points is finite. As far as we are aware, there is as yet no guarantee that this is the case. 

If S\ has multiplicity greater than two then higher-order differentiation of (0 is needed, but otherwise the generalization is 
rather straightforward. If there are several repeated poles a similar approach can be followed. 

VII. Examples 

A. General 

This section presents two examples on solving the H2 model reduction problem and discusses several computational issues. 
The following is an outline of the algorithm implemented: 

1) For the given A-th order transfer function to be reduced, construct the A-by-A matrix M (see equation (TOTi). 

2) For i = 1, A, construct the 2 JV -by-2 JV matrix Ax t from M (see Theorem 14.31 and the following paragraph, and note 
that the reduction procedure of section [ill] is crucial in enabling this to be done). 

3) Compute the eigenvalues and eigenvectors of all the ^x.'s. Assume, for simplicity, that each Ax t has a simple Jordan 
structure. Arrange these eigenvalues and eigenvectors such that the j-th eigenvector of Ax t corresponds to the j-th 
eigenvector of Ax i2 for all j = 1, ...,2 and = 1,..., A. Letting £y denote the j-th eigenvalue of Ax ( , form 
the A-tuples . . . , £,N,j) for j = 1, 2 N . Now each of these A-tuples contains the eigenvalues that correspond to 
one of the common eigenvectors of the set {Ax;}- Our current implementation of this step uses numerical methods, so 
there are potential problems which can arise if eigenvalues and/or eigenvectors are repeated, or nearly so. We have not 
attempted to cope with all such eventualities. 

4) Solve for Sj, using equation (flOl l, by making the association 

[a(—Si), . . . ,a(-S N )] = [£ lt j, . . .,£nj]- 

Normalise the coefficients such that a/v-i = 1 to obtain a{. Eliminate those polynomials a(s) = s w_1 + apf-2S N ~ 2 + 
. . . + Oo which are not admissible pole polynomials of an approximating system, because they are not real Hurwitz. 

5) For each admissible pole polynomial a(s), obtain the zero polynomial b(s) from equation (0J. In practice the equation 
does not hold exactly, so a least-squares solution is found. 

All the above steps except that of computing eigenvalues and eigenvectors can in principle be performed symbolically. Two 
different implementations have been attempted and they differ only in whether step|2]is performed symbolically or numerically; 
note that steps [5] and |4] are done numerically here. For the symbolic implementation of step [2] the Ax t 's are computed from 
a symbolic definition of M = [rrijk] using computer algebra softwar^H and the resulting symbolic expressions for the Ax t 's 
(see the Appendix) are stored in a file to be read in by numerical softwar^l later. This has the advantage that the symbolic 
computation only has to be performed once for a given model order. Unfortunately, the length of these symbolic expressions 
soon becomes very large as the model order increases; the size of the file storing these expressions comes to more than 5 
Mbytes for model order equal to 7 and this thus presents a practical limit to this implementation. Alternatively, due to the 
simplicity of the reduction procedure (see section ITTIli. step [2] can be implemented in a straightforward manner in a numerical 
package 2 . In this case, the highest model order that we could reduce is 9, which involves storing 9 512 x 512 matrices, and we 
ran into memory problems for model orders higher than this. The computer we used was a Sun Ultra 10, 300 MHz processor 
with 640 MByte RAM. 

There are a number of numerical issues pertaining to this algorithm. Some of these issues are well known, e.g. possible 
ill-conditioning of Vandermonde matrices and the computation of eigenvalues and eigenvectors. These numerical problems will 
also cause difficulty in later steps of the algorithm. For example, numerical error may cause us to regard a real polynomial 
as complex in step [4] and as a result, a true local minimum of the problem may be mistakenly considered as inadmissible. 
The current implementation of this algorithm does not strive to overcome nor detect these problems. It is also beyond the 

'in our case, Maple. 
2 In our case, Matlab. 



scope of this paper to give full numerical analysis of the proposed algorithm of this paper. A rudimentary check that we have 
employed is to examine the least-squares error in step [31 however, this error must be interpreted with care as a small residual 
error does not necessarily indicate an accurate solution [17]. Moreover, this check will not be able to tell us whether a correct 
solution has been rejected. We have applied our algorithm to the three third order systems that were investigated in [18] where 
a symbolic algorithm was used to reduce them to second order systems. In this case, symbolic computation ensures that all 
stationary points of the problem are computed and we find that the algorithm of this paper is able to find the same sets of 
critical points as those reported in [18]. This comparison may indicate that our algorithm is likely to return the entire set of 
stationary points when the model order is small. 

B. Example 1: An easily reduced system 

The system to be reduced is a 9th order transfer function and it is the highest order model that we could reduce thus far. 
This system has Hankel singular values 9, 8, . . . , 2, 1 and its transfer function is 

8.4800s 8 -2.5942s 7 + 153.5350s 6 +38.8803s 5 +599.3205s 4 + 196.3752s 3 +315.3021s 2 +6.4558s+9.4478xlQ- 5 
s 9 +2.1179s s + 16.1278s 7 +25. 6052s 6 +62. 7884s 5 +79. 1895s 4 +42.6617s 3 +32.5279s 2 +0.2514s+2. 2495x10-" 

The algorithm finds 8 admissible stationary points altogether. The best approximant is 

8 4799s -2 5955s 6 + 153. 5327s 5 +38. 8546s 4 +599.3039s 3 + 196.2798s 2 +315.2701s+6. 4351 
s s +2.1176s 7 + 16.1275s e +25.6013s 5 +62.7850s 4 +79.1756s 3 +42. 6527s 2 +32. 5215s+0. 2499 

and it gives Hi model reduction error of 0.0344 and in comparison with the norm of the original transfer function 8.8261, 
this gives a relative error of 0.39%. Note that the coefficients of this approximant are very similar to those of the original 
transfer function and this can be accounted for as follows: the original transfer function has a pole at —8.9582 x 10~ 6 and a 
zero at —1.4645 x 10~ 5 . The model reduction algorithm appears to have removed this very closely spaced pole-zero pair and 
to have left the other poles and zeros nearly unchanged. The other seven approximants give errors of 0.8703, 0.8707, 1.6463, 
1.6466, 1.6536, 1.6538 and 1.6650. Provided that all the stationary points of this optimisation problem have been computed, 
then the solution that gives the minimum error is in fact the global minimum of the problem. The other stationary points may 
correspond to local minima, saddle points or even local maxima. 

C. Example 2: A relaxation system 

The system to be reduced is taken from p. 162 of [39] and is given by 

N 2] 

G(s) = V^-^- witha>0. (18) 
A — ' s + or- 3 
i=i 

It is shown in [39] that all the Hankel singular values of this system tend to | as a — > oo. On the other hand, when a ~ 1 
and N > 1, the system is close to non-minimality as a = 1 gives rise to a first order system. Our algorithm has numerical 
difficulty when a is chosen either too large or too close to 1 . In both cases, the Vandermonde matrix becomes ill-conditioned: 
the rows contain entries of drastically different magnitude in the first case and the poles are too close to each other in the 
second. 

Since the poles of this system are all real, this gives rise to a real M matrix and in turn real Ax/s. Due to the form of 
Grobner basis defined by M, zero is always an eigenvalue of Axi (independent of whether M is real or complex). Since the 
dimension of Ax t is 2 N — an even number — and Ax t is real, Ax 4 must have at least one other non-zero real eigenvalue. 
For a close to zero or unity, we find in our examples there is a real eigenvalue that is approximately zero and the eigenvectors 
corresponding to this eigenvalue and the zero eigenvalue are almost parallel to each other. This gives rise to difficulty in 
matching the eigenvectors. 

For model order N = 5, our algorithm succeeded in finding an approximant for systems with a in the interval [0.38, 0.79] but 
failed in the intervals (0, 0.38) and (0.79, 1). For a in the intervals (0, 0.38) and (0.84, 1), our algorithm returns no solution as 
it either has difficulty in matching the eigenvectors or has rejected the admissible solutions because they are not real Hurwitz. 
Our algorithm does return a solution for a £ (0.79, 0.84] but a closer examination of the obtained approximant shows that it is 
not a relaxation system. Since the system in equation ( fT8l is a relaxation system and it is proved in [4] that H2 approximants 
of relaxation systems are also relaxation systems, it implies that the solution given by our algorithm for this range of a is 
unacceptable. 

It is also shown in [4] that any stable relaxation system, whose poles all have modulus smaller than w 0.707, has only 
one admissible solution of the first-order optimality conditions. For a = 0.78, the largest pole is 0.6084 and there should 
therefore be only one such solution. For this case our algorithm returns precisely one admissible solution, in accordance with 
this theory. It has absolute error 0.0334, which can be compared to the norm 1.6980 of the original system to give a relative 
error of 1.96%. The transfer function of this approximant is 

1.4240s 3 + 1.0946s 2 +0.2371s+0.0134 
s 4 + l. 1781s 3 +0.4457s a +0.0627s+0. 0028 " 

which can be shown to be a relaxation system. 

As an alternative to the algorithm described at the beginning of this section, we have also treated Example 2 using an 
algorithm based on building up the critical value matrix using dT5l . The same results were obtained with both algorithms, 
except when a was very close to 1. For example with N = 2 and a = 0.999 the first algorithm continued to give the correct 
result (which was checked using exact algebraic calculation) but the second did not, because of numerical imprecision. 



VIII. Conclusions 



The application of constructive algebra methods to the H2 approximation problem offers the possibility of guaranteed location 
of the globally optimal approximant, despite the fact that this is a non-convex problem. Furthermore, the location of this optimal 
approximant could, in principle, be computed to any desired precision, by employing 'symbolic' methods throughout. 

One can envision, however, that these methods could be used in conjunction with more conventional numerical methods in 
a number of ways, to obtain various precision/efficiency trade-offs. One possibility is the one used by us to solve the examples 
in this paper, namely to employ conventional numerical eigenvalue solvers from the point at which the matrices Ax t have 
been determined. Another possibility would be to use constructive algebra methods to obtain an upper bound for the number 
of admissible critical points, and/or the value of the criterion function at the optimal approximant (which can be done without 
computing the optimal approximant itself), and to use these results to check the candidate optima obtained by a conventional 
numerical optimization approach. 

It should be kept in mind that constructive algebra also offers the possibility of dealing with purely symbolic problem 
specifications — that is, of producing 'generic' results (for all transfer functions of a given order, say) rather than results for 
one specific system. This can be done in principle, although in practice the complexity of the required computations is well 
beyond current possibilities. 

The commutative matrix approach which we have used to solve the system of critical-point (polynomial) equations is 
currently the subject of intense research activities in the computer algebra community, and in the systems theory community 
[7], [8] with good prospects of much more efficient algorithms being developed. We therefore expect that it will soon be 
possible to approximate higher-order systems than the ones we have been able to tackle in this paper, using essentially the 
same methods. Also, we expect that such developments will make constructive algebra methods attractive and feasible tools 
for tackling a wider range of problems in systems and control theory. 
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